# Low Power SoC Communication Using Steiner Graph AMBA Architecture

#### Krishna Kumar.S

Abstract—Shrinking chip dimensions permit more sophisticated designs, but the system components must then operate at considerably lower power. Altering the data and control paths of a design can reduce the power dissipated in a chip to a great extent. The limitation of a normal bus design is that data and control values are transmitted to all SoC components regardless of their destination. Because chip selection takes place inside the components, one or many circuits are activated according to the control signal, but the transmission cost is high because signals travel along unnecessary parts of the bus path. We combine the Steiner graph method of predicting the best bus routing path with a gated clock tree structure, which makes the design more flexible. The bus is implemented with the AHB bus protocol, chosen for its robust and flexible design. A low-power SoC can thus be designed and implemented without modifying the structure of its components, only by altering the structure of the bus path. Power consumption can be reduced further by also modifying the structure of the components.


Index Terms— Communication Architecture, Low Power Design, System-On-Chip, Steiner Graph, Clock Gating.

# **1 INTRODUCTION**

As the feature size of process technology scales down, SoCs are able to integrate more components and gain higher complexity. Because clock frequency is limited by power and thermal constraints, better performance is achieved through parallelism. On-chip communication will become critical in the future: meeting the latency and bandwidth requirements between components becomes a challenge, and the power consumed by inter-component communication has scaled up to a significant level. The existing on-chip bus standards provide an interface for IP developers and for system designers. Compared with network-on-chip communication, buses have a small silicon footprint, low latency, and are easy to implement [2]. This simplicity of implementation also lets designers apply various optimizations to get the best performance from the available resources. Present bus architectures, however, are not power efficient when transferring data on bus lines. When high bandwidth is required on these buses, wire efficiency is low, which limits the system bandwidth capacity.

The proposed synthesis scheme for on-chip buses eliminates these disadvantages of existing bus systems without changing the existing protocols or component interfaces. It is based on shortest-path Steiner graphs, maximizes the efficiency of the bus lines, and requires no redesign of system components or IP modules. Low power is maintained with reduced routing resources. The technology trend favors this physical synthesis scheme, which brings a large improvement in power performance to current on-chip buses and bus matrices. To evaluate system timing, a communication constraint graph is extracted from a specific application, and the performance of each topology configuration is estimated by analysis and simulation [3]. A detailed power analysis of the AMBA on-chip bus has been performed, in which the decomposition of the power consumed by the system components is obtained by simulation on NEC's gate-level power estimator. Clock gating is now widely used to reduce dynamic power, and power gating is used to avoid unnecessary static power. In bus communication, a large part of the power is consumed on the wires of the bus lines.

<sup>•</sup> Krishna Kumar.S is currently pursuing a master's degree program in Embedded System Technologies at Veltech Multitech Dr.RR and Dr.SR Engineering College, affiliated to Anna University, India. PH: +919943812995. E-mail: Krishna.veltech@gmail.com

A power-performance tradeoff has been analyzed on bus matrices, where a bus matrix is composed of a set of tree-structured buses [4]. We extend these structures from trees to graphs, using Steiner graph connections to optimize bus gating and minimize communication power. On the power side, communication power depends on the wire capacitance switched in a data transaction. On the performance side, delay is dominated by signal propagation distance, and bandwidth is limited by routing congestion and by the resulting power and thermal constraints.

The rest of the paper is organized as follows. Section 2 introduces bus matrix design. Section 3 describes the proposed power reduction using the bus gating technique combined with Steiner graphs. Section 4 presents the implementation, and Section 5 concludes the paper.

#### 2 BUS MATRIX DESIGN

In this paper we look at bus matrix based communication architectures, which designers are currently considering to meet the high bandwidth requirements of modern SoC systems. Fig. 1 shows an example of a three-master, seven-slave AMBA bus matrix architecture for a dual ARM processor based networking subsystem application.



#### Fig.1. Full bus matrix architecture

A bus matrix consists of several buses in parallel, which can support concurrent high-bandwidth data streams. The Input stage is used to handle interrupted bursts, and to register and hold incoming transfers if the receiving slaves cannot accept them immediately [6]. The Decode stage generates the select signal for the appropriate slave. Unlike in traditional shared bus architectures, arbitration in a bus matrix is not centralized, but rather distributed so that every slave has its own arbitration. One drawback of the full bus matrix structure shown in Fig. 1 is that it connects every master to every slave in the system, resulting in a prohibitively large number of buses in the matrix. The excessive wire congestion can make it practically impossible to route the design and achieve timing closure.
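The decode stage and the per-slave arbitration described above can be sketched in a few lines of Python. This is only an illustrative model: the address map, the master and slave names, and the fixed-priority rule are assumptions made for the example and are not taken from the AMBA specification or from Fig. 1.

```python
from collections import defaultdict

# Hypothetical address map: slave name -> (base address, size in bytes).
ADDRESS_MAP = {
    "S0": (0x0000, 0x1000),
    "S1": (0x1000, 0x1000),
    "S2": (0x2000, 0x1000),
}

def decode(addr):
    """Decode stage: return the slave whose address range contains addr."""
    for slave, (base, size) in ADDRESS_MAP.items():
        if base <= addr < base + size:
            return slave
    return None  # address is not mapped to any slave

def arbitrate(requests):
    """Distributed arbitration: every slave grants one master on its own.

    requests is a list of (master, address) pairs issued in the same cycle.
    Fixed priority (lowest master name wins) is an assumption of this sketch.
    """
    per_slave = defaultdict(list)
    for master, addr in requests:
        slave = decode(addr)
        if slave is not None:
            per_slave[slave].append(master)
    return {slave: min(masters) for slave, masters in per_slave.items()}

def full_matrix_links(num_masters, num_slaves):
    """Master-to-slave connections in a full bus matrix."""
    return num_masters * num_slaves

# M0 and M2 compete for S0, while M1 accesses S1 concurrently.
grants = arbitrate([("M0", 0x0010), ("M1", 0x1004), ("M2", 0x0020)])
print(grants)
print(full_matrix_links(3, 7))
```

In this model, transfers to different slaves proceed in parallel, and a full matrix for the three-master, seven-slave example needs 21 master-to-slave connections, which illustrates why wire congestion grows quickly.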

## 3 POWER REDUCTION USING BUS GATING TECHNIQUE

Standard on-chip buses like AMBA were designed to enable fast and convenient integration of system components into the SoC, with simplicity as one of the major objectives. Once bus power consumption reaches a level that can no longer be ignored, power optimization becomes desirable. We introduce a "bus gating" technique to minimize the power on bus lines with a small compromise on design simplicity [5]. The power efficiency of such a bus architecture is low because the bus lines from masters to slaves connect all the slave devices through a single large wire net. The same holds for slave-to-master connections. Although the communication is one-to-one, the signals are sent to all receivers regardless of whether they are needed, which wastes dynamic power on bus wires and component interfaces, as shown in Fig. 2. For tree-structured buses, distributing the multiplexers and de-multiplexers into the wire net helps to save both power and wires.



Fig.2. Bus gating using distributed mux and demux. (a) On single bus. (b) On bus matrix.

An arborescence is a directed tree in which every root-to-leaf path is a shortest path. On the receiver side, with distributed de-multiplexers, the bus lines change from a rectilinear Steiner minimum tree to a minimum rectilinear Steiner arborescence (MRSA).

According to prior research, this change increases the wire length by only 2–4% on average. The total bus wire length can therefore be reduced by distributing the multiplexers and de-multiplexers, while the dynamic power is reduced at the same time [1]. There is a small control overhead for sending the signals over the arborescence, but compared to the bus width and data throughput, this dynamic power overhead is negligible.
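To make the saving concrete, the following Python sketch compares the wire length that switches on an ungated tree bus (every segment toggles on every transfer) with the length that switches when distributed de-multiplexers gate the tree (only the root-to-destination path toggles). The tree, its segment lengths, and the assumption that dynamic power is proportional to switched wire length are illustrative choices for this example, not data from the paper.

```python
# Hypothetical bus tree: child -> (parent, length of the wire segment feeding
# the child). "root" is the master; s1, s2 and s3 are slaves.
TREE = {
    "a": ("root", 2.0),
    "b": ("root", 1.0),
    "s1": ("a", 1.0),
    "s2": ("a", 3.0),
    "s3": ("b", 2.0),
}

def broadcast_length(tree):
    """Ungated bus: the signal is driven onto every segment of the net."""
    return sum(length for _, length in tree.values())

def gated_length(tree, dest):
    """Gated bus: only the segments on the root-to-dest path are driven."""
    total = 0.0
    node = dest
    while node in tree:          # walk up until the root is reached
        parent, length = tree[node]
        total += length
        node = parent
    return total

for slave in ("s1", "s2", "s3"):
    print(slave, gated_length(TREE, slave), "of", broadcast_length(TREE))
```

For this tree, a transfer to `s1` switches 3 units of wire instead of 9, so under the stated proportionality assumption the wire-related dynamic power of that transfer drops to one third.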

Based on the same tree topology, effective bus gating can be applied by distributing the control over the entire tree. On bus matrices, however, simply adding de-multiplexers may increase the total wire length, because when the number of master-to-slave paths becomes large, each path needs its own bus wires.

To reduce wire length in the bus matrix, and to further reduce power on the basic bus, we adopt the structure of Steiner graphs. A Steiner graph is a generalization of a Steiner tree, without the limitation of a tree structure that there is only one root placed at a certain point, which cannot be on the shortest path of every connection [7]. By removing the constraint of tree topologies, we gain more freedom to choose shortest paths for reduced power on data transactions, and to let the paths share wires for reduced routing congestion. By definition, for an unweighted graph $G=(V,E)$, $G'=(V',E',\omega)$ is a Steiner graph of $G$ if $V \subseteq V'$ and, for any pair of vertices $u, w \in V$, the distance between them in $G'$ is at least the distance between them in $G$. Fig. 3 shows a Steiner graph of $G$ with $V=\{s_1,s_2,t_1,t_2\}$, $E=\{(s_1,t_1),(s_1,t_2),(s_2,t_1),(s_2,t_2)\}$ with each edge weighted 1, and its implementation as a bus or bus matrix.

This graph is minimal in terms of total wire length. Moreover, every edge in $E$ has a path in $G'$ with minimum length, i.e., the path length equals the Manhattan distance between the two vertices.

In this way, each data transaction involves minimal wires, leading to minimal dynamic power on bus lines.
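The shortest-path property can be checked mechanically. The Python sketch below places two masters and two slaves at hypothetical grid coordinates (these are not the coordinates of Fig. 3), adds two Steiner points `p` and `q`, and verifies with Dijkstra's algorithm that every required master-to-slave connection has a path whose length equals the Manhattan distance between its endpoints.

```python
import heapq

# Hypothetical placement (x, y). p and q are Steiner points added for routing.
POS = {"s1": (0, 2), "t1": (2, 2), "s2": (0, 0), "t2": (2, 0),
       "p": (1, 2), "q": (1, 0)}
# Required master-to-slave connections (the edge set E of the text).
REQUIRED = [("s1", "t1"), ("s1", "t2"), ("s2", "t1"), ("s2", "t2")]
# Wire segments of the candidate Steiner graph.
WIRES = [("s1", "p"), ("p", "t1"), ("s2", "q"), ("q", "t2"), ("p", "q")]

def manhattan(u, w):
    (x1, y1), (x2, y2) = POS[u], POS[w]
    return abs(x1 - x2) + abs(y1 - y2)

def shortest(src, dst):
    """Dijkstra over the wire segments, weighted by segment length."""
    adj = {}
    for u, w in WIRES:
        d = manhattan(u, w)
        adj.setdefault(u, []).append((w, d))
        adj.setdefault(w, []).append((u, d))
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for w, length in adj.get(u, []):
            nd = d + length
            if nd < dist.get(w, float("inf")):
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return float("inf")

def is_shortest_path_graph():
    return all(shortest(u, w) == manhattan(u, w) for u, w in REQUIRED)

shared_wire = sum(manhattan(u, w) for u, w in WIRES)        # wires are shared
dedicated_wire = sum(manhattan(u, w) for u, w in REQUIRED)  # one net per pair
print(is_shortest_path_graph(), shared_wire, dedicated_wire)
```

In this placement the shared graph uses 6 units of wire, against 12 units if each of the four connections had its own dedicated shortest wire, while every transaction still travels a shortest path.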

Shortest-path Steiner graphs have advantages in power efficiency, as shown above. Naturally, graph structures also have an advantage over trees in communication bandwidth [7].



#### Fig. 3. Shortest-path Steiner graph G" and its bus implementation

Our objective in bus gating and bus matrix synthesis is to perform a balanced optimization of power and bandwidth, even when the available routing resources are limited.

### **4 IMPLEMENTATION**

#### A) Steiner Graph

The Steiner tree problem, or minimum Steiner tree problem, is a problem in combinatorial optimization that may be formulated in a number of settings, with the common part being that it is required to find the shortest interconnect for a given set of objects. The Steiner tree problem is superficially similar to the minimum spanning tree problem: given a set V of points (vertices), interconnect them by a network graph of shortest length, where the length is the sum of the lengths of all edges. The difference between the two problems is that, in the Steiner tree problem, extra intermediate vertices and edges may be added to the graph in order to reduce the length of the spanning tree [8]. The new vertices introduced to decrease the total length of the connection are known as Steiner points or Steiner vertices.

IJSER © 2012 http://www.ijser.org

It has been proved that the resulting connection is a tree known as the Steiner tree. There may be several Steiner trees for a given set of initial vertices. The Steiner tree problem has applications in circuit layout or network design.
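The difference between a minimum spanning tree and a Steiner tree can be seen on three terminals. The Python sketch below computes the rectilinear (Manhattan) spanning tree length with Prim's algorithm, then tries every other intersection of the terminals' x and y coordinates as a single extra Steiner point. This one-point search is a simple illustration chosen for the example; it is not the synthesis algorithm used in the paper, and the terminal coordinates are hypothetical.

```python
from itertools import product

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def mst_length(points):
    """Prim's algorithm on the complete graph with Manhattan edge lengths."""
    points = list(points)
    in_tree = {points[0]}
    total = 0
    while len(in_tree) < len(points):
        # Cheapest edge from the tree built so far to a point outside it.
        length, nxt = min(
            (manhattan(a, b), b)
            for a in in_tree for b in points if b not in in_tree
        )
        total += length
        in_tree.add(nxt)
    return total

def best_single_steiner_point(terminals):
    """Try each grid intersection as one Steiner point and keep the best."""
    xs = sorted({x for x, _ in terminals})
    ys = sorted({y for _, y in terminals})
    best_len, best_pt = mst_length(terminals), None
    for cand in product(xs, ys):
        if cand in terminals:
            continue
        length = mst_length(list(terminals) + [cand])
        if length < best_len:
            best_len, best_pt = length, cand
    return best_len, best_pt

TERMINALS = [(0, 0), (2, 0), (1, 1)]  # hypothetical component positions
print(mst_length(TERMINALS))
print(best_single_steiner_point(TERMINALS))
```

Here the spanning tree over the three terminals has length 4, while adding the Steiner point (1, 0) connects the same terminals with a total length of 3.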

#### **B)** Clocked Gate

Clock gating is an architecture in which unnecessary signal paths and power losses are controlled effectively, for example with a gated driver tree structure that controls exactly which transmission path inside the bus is active, so that signals are not transmitted to unnecessary nodes. It is called a clock-gated driver because the clock is likewise not transmitted to the other nodes [5]. The clock is used only along the path to the destination and not along unnecessary node paths, which further reduces the power consumption compared with existing work. To save area, the memory module of a delay buffer is often implemented as a static RAM array with an input/output data bus. Special read/write circuitry, such as a sense amplifier, is needed for fast and low-power operation. However, of all the memory cells, only two words are activated. Driving the input signal all the way to every memory cell is therefore a waste of power. This can be avoided by combining the Steiner graph with the clock gating technique.

#### C) Steiner Graph Combined With Clock Tree Gating



#### Fig.4. Steiner graph combined with clock tree gating

The bus architecture has the limitation that a signal is transmitted throughout the network structure, so every node in the network receives the input data. Only the node addressed as the destination accepts and acknowledges the data; the other nodes, which do not hold the destination address, simply ignore it. The result is unnecessary path delay.

Power is also lost in transmitting the data through the unnecessary nodes [5]-[9]. In the proposed design shown in Fig. 4, we therefore modify the bus structure so that it delivers data only to the exact destination node. This reduces the power loss and the path metric, and brings several further advantages to the bus design. The exact node, or the shortest path to it, is predicted with the Steiner graph, which is used to find the exact path metric and the exact bus destination. A further advantage of the proposed system is that the clock tree architecture gives easy access to a particular memory block. The bus path structure can be navigated easily to select and deselect one or more paths in parallel, so bus utilization is reduced and power is minimized.
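The effect of gating a driver tree can be illustrated with a small Python model. It assumes a complete binary driver tree with heap-style node numbering (root is node 1, children of node i are 2i and 2i+1) and one driver feeding each non-root node. These modelling choices are assumptions for the example and do not describe the exact structure of Fig. 4.

```python
def active_path(dest_leaf, depth):
    """Nodes whose drivers must be enabled to reach leaf number dest_leaf.

    Leaves are numbered 0 .. 2**depth - 1 from left to right. The returned
    list runs from the node just below the root down to the leaf itself.
    """
    node = (1 << depth) + dest_leaf   # heap index of the destination leaf
    path = []
    while node > 1:                   # climb until the root is reached
        path.append(node)
        node //= 2
    return list(reversed(path))

def driver_counts(depth):
    """Drivers that switch per transfer: (ungated tree, gated tree)."""
    ungated = (1 << (depth + 1)) - 2  # every non-root node is driven
    gated = depth                     # only the path to one leaf is driven
    return ungated, gated

print(active_path(5, 3))
print(driver_counts(3))
```

For a tree of depth 3 with eight leaves, a transfer to leaf 5 enables only the drivers of nodes 3, 6 and 13, so 3 drivers switch instead of 14. The gap widens with depth, since the ungated count grows exponentially while the gated count grows linearly.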

### 5 CONCLUSION

In this paper we propose a physical synthesis scheme for on-chip buses and bus matrices to minimize the power consumption, without changing the interface or arbitration protocols. By using a bus gating technique, data transactions can take shortest paths on chip, reducing the power consumption of bus wires to a minimum. Routing resources and bandwidth capacity are also optimized by the construction of a shortest-path Steiner graph, wire sharing among multiple data transactions, and wire reduction heuristics on the Steiner graph. Experiments indicate that the gated bus from our synthesis flow can save more than 90% dynamic power on average data transactions in current AMBA bus systems.

### REFERENCES

[1] Renshen Wang, Yulei Zhang, Nan-Chi Chou, Evageline, Ronald, "Bus matrix synthesis based on Steiner graph for power efficient SoC communication," in IEEE Trans. Computer-Aided Design, 2011, pp. 167–179.

[2] W. Dally and B. Towles, "Route packets, not wires: On-chip interconnection network," in Proc. ACM/IEEE Des. Autom. Conf., Jun. 2001, pp. 684–689.

[3] K. Lahiri, A. Raghunathan, and S. Dey, "Efficient exploration of the SoC communication architecture design space," in Proc. Int. Conf. Comput.-Aided Design, 2000, pp. 424–430.

[4] S. Pasricha, Y.-H. Park, F. J. Kurdahi, and N. Dutt, "System-level power performance tradeoffs in bus matrix communication architecture synthesis," in Proc. Int. Conf. Hardw.-Softw. Codesign Syst. Synthesis, 2006, pp. 300–305.

[5] L.A.Ca, Q. Wu, M. Pedram, and X. Wu, "Clock-gating and its application to low power design of sequential circuits," in Proc. IEEE Custom Integr. Circuits Conf., vol. 47, Mar. 2000, pp. 415–420.

[6] Sudeep Pasricha, Nikil Dutt, and Mohamed Ben-Romdhane, "Bus matrix communication architecture synthesis," in CECS Conf., 2005.

[7] W. Shi and S. Chen, "The rectilinear Steiner arborescence problem is NP-complete," in Proc. ACM-SIAM Symp. Discrete Algorithms, 2000, pp. 780–787.

[8] R. Wang, N. C. Chou, B. Salefski, and C. K. Cheng, "Low power gated bus synthesis using shortest-path Steiner graph for system-on-chip communications," in Proc. ACM/IEEE Des. Autom. Conf., Jul. 2009, pp. 166–171.

[9] Po-Chung, Jing-Siang Jhuang, Pei-Yun Tsai, "A low power delay buffer using gated driver tree," in IEEE Trans. Very Large Scale Integration, 2009.